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ABSTRACT 



Oh! 
I 



Aims. The aim of the LOFAR Epoch of Reionization (EoR) project is to detect the spectral fluctuations of the redshifted HI 21cm signal. This 
signal is weaker by several orders of magnitude than the astrophysical foreground signals and hence, in order to achieve this, very long integrations, 
^ ' accurate calibration for stations and ionosphere and reliable foreground removal are essential. 
5-H , Methods. One of the prospective observing windows for the LOFAR EoR project will be centered at the North Celestial Pole (NCP). We present 
. results from observations of the NCP window using the LOFAR highband antenna (HBA) array in the frequency range 115 MHz to 163 MHz. 
' The data were obtained in April 201 1 during the commissioning phase of LOFAR. We used baselines up to about 30 km. The data was processed 
I ' ' using a dedicated processing pipeline which is an enhanced version of the standard LOFAR processing pipeline. 

, Results. With about 3 nights, of 6 hours each, effective integration we have achieved a noise level of about 100 /jJy/PSF in the NCP window. 
. Close to the NCP, the noise level increases to about 180 yuJy/PSF, mainly due to additional contamination from unsubtracted nearby sources. We 
' estimate that in our best night, we have reached a noise level only a factor of 1 .4 above the thermal limit set by the noise from our Galaxy and the 
>— ^ ' receivers. Our continuum images are several times deeper than have been achieved previously using the WSRT and GMRT arrays. We derive an 
, analytical explanation for the excess noise that we believe to be mainly due to sources at large angular separation from the NCP. We present some 
details of the data processing challenges and how we solved them. 

Conclusions. Although many LOFAR stations were, at the time of the observations, in a still poorly calibrated state we have seen no artefacts in 
our images which would prevent us from producing deeper images in much longer integrations on the NCP window which are about to commence. 
f"*) , The limitations present in our current results are mainly due to sidelobe noise from the large number of distant sources, as well as eiTors related to 
, station beam variations and rapid ionospheric phase fluctuations acting on bright sources. We are confident that we can improve our results with 
refined processing. 

Key words. Instrumentation: interferometers - Techniques: interferometric - Cosmology: observations, diffuse radiation, reionization 

X 

, 1. Introduction (Giant Metrewave Radio Telescope). While MWA (Murchison 

" ■ ■ Widefield Array) and PAPER (Precision AiTay to Probe the EoR) 

A major epoch in the history of the Universe yet to be understood f^jj hardware d eployment yet, there a re still relevant 

in detail is its Dark Ages and the Epoch of Reionization (EoR). ^^^^^^ juced. In lOrd et al.l $20Wt> and IWilliams et al.l 

Observational evidence for this era can be gathered with high ^^^^ -^^-^^ widefield ima ges of the so uthern sky using 32 
probabihty by studying the fluctuations of the redshifted neutral j.^^^ presented. In Llacobs et al.l (l20TTl) . full sky im- 

hydrogen at redshifts corresponding to 6 < z < 12. Therefore, ^^^^^^ ^^^^j^ p^p^^ presented, 

there are numerous experiments becoming operational and al- 
ready collecting data, especially in the frequency range from 115 In preparing for the LOFAR EoR project we have conducted 
MHz to 240 MHz to reach this goal. several pilot experiments with the Low Frequency Frontends on 

At the forefr ont of such experirnents i s the Low Frequency the WSRT in a relevant frequency range: 138-157 MHz. The 
Array (LOFAR) (Ivan Haarlem et al ] l2013h . Similar EoR exper- results of these observations, and a d iscussion of their limita- 
iments u sing other radio tel escopes are already underway. For tions, have been described bv iBernardi et al. ( 2009. ,2010 ). For 
instance. IPaciga et alj ( 1201 ll) provide a new lower bound for the LOFAR in its commissioning phase we have adopted a multi- 
statistical detection threshold of HI fluctuations using the GMRT faceted observing strategy, building on the experience gained 
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from the WS RT data. The ra t ionale behind this is described in 
more detail in lde Bruvn et al.l (l2013h . A brief summary follows: 
Using LOFAR in its commissioning phase we have observed 
and processed three very diverse windows. One window contains 
a very bright compact source, 3C196, which allows exquisite ab- 
solute calibration as well as a study of the systematics at very 
high spectral and image dynamic range. At a declination of only 
48 degrees it will allow a study of elevation dependent effects. 
The many bright compact field sources around 3C196 also allow 
a study of ionospheric non-isoplanaticity. The results of these 
observations, with e mphasis on all those topic s, will be described 
in a separate paper (iLabropoulos et aL The second win- 

dow was chosen to ascertain possible damaging effects of faint 
signals due to instrumental leakage of bright polarized Galactic 
foreground sign als. These r esults , on the Elais Nl window, will 
be described bv lJelic et al.l(l2013h in the second paper in this se- 
ries. 

The third window, without bright sources and with relatively 
faint diffuse linearly polarized emission from our Galaxy, is ac- 
cessible throughout the year. This window is centered on the 
North Celestial Pole (NCP). It has been used to experiment with 
various calibration approaches as well as conduct a thorough 
analysis of the noise levels attainable with the current, still in- 
complete, LOFAR array. These results are the subject of the first 
paper in the series. 

The three windows described above have thus far been ob- 
served using a single digital beam with about 48 MHz band- 
width. Between these three windows we expect to address most 
of the issues that will affect much longer observations with 
LOFAR, which should go one order of magnitude deeper in 
noise level. The analysis of the results obtained in these three 
windows, as they pertain to EoR signal levels, will be discussed 
in more detail in subsequent publications. 

In this paper, we present results of LOFAR observations 
pointed at the NCP in the frequency range 115 MHz to 163 
MHz. The NCP was previously observed using the Westerbork 
Synthesis Radio Telescope (WSRT) in a similar frequency range, 
albeit with limited integration time and resolution. As reported 
by [Bernardi et al. (2010), the WSRT observations are mainly 
limited by broadband (and low level) radio frequency interfer- 
ence (RFI), and classical confusion (due to having limited < 3km 
longest baseline) that prevents reaching the theoretical noise 
level. 

LOFAR provides significant challenges as well as advan- 
tages over conventional low frequency radio telescopes in terms 
of calibration. Unlike the WSRT, antennas on the ground are 
much less susceptible to broadband RFI. On the other hand, cal- 
ibration of a LOFAR observation is challenging due to many 
reasons including spatially and temporally varying beam shapes 
with wide fields of view as well as mild to severe ionospheric 
distortions. Therefore, it is paramount that we test and demon- 
strate the feasibility of LOFAR for EoR observations before 
starting the long, dedicated observing campaign. 

The results reported in this paper are based on integrations 
of the NCP which consisted of 3 nights with 6 hours each night- 
time observing. We provide details of the calibration and imag- 
ing that lead us to almost reach the expected theoretical noise 
level (within a factor of 1.4). We also provide details of current 
limitations and what we expect with the current, still incomplete 
LOFAR. Based on this result, we see no show stoppers for the 
launch of the dedicated LOFAR EoR observations on this win- 
dow which will last several hundred hours. 

This paper is organized as follows: In section |2] we give 
an extensive overview of the observational setup. Section |3] de- 



scribes the data processing pipeline. In section|4] we present ini- 
tial results with deep very widefield images, new sources, and 
the noise behaviour We give an analytical explanation for the 
noise behavior in section |5] that considers the excess noise due 
to sources far away from the phase center Finally, we draw our 
conclusions in section|6] 

Notation (mostly in section|5]i: We use bold lowercase letters 
for vectors and bold uppercase letters for matrices. The matrix 
Frobenius norm is given by ||.||. The matrix transpose, Hermitian 
transpose and pseudoinverse are given by (.)^, (.)^ and (.) ' , re- 
spectively. The trace of a matrix is given by trace(.). The real 
part of a complex number is denoted by Re(.). The statistical 
expectation operator is denoted by £{.). 

2. Observational Setup 

In this section we provide details of the LOFAR stations used 
in the NCP observations. We also provide the motivation behind 
observing the NCP. 

2.1. LOFAR stations 

We give a brief overview of LOFAR hardware and a com- 
plete overview can be found in Ivan Haarlem et aL and 
Ide Bruvn et"ai1 (|2013|) . Each LOFAR HBA station consists of 
multiple elements (dipoles) with dual, linear polarized receivers. 
For a core station (CS), there are 384 dipoles, arranged in 24 tiles 
that have dipoles on a 4 x 4 grid. For a remote station (RS), there 
are 768 dipoles, arranged in 48 tiles. The signals of each dipole 
in a tile are coherently added (or beamformed) to form a narrow 
field of view (FOV) along a given direction in the sky. The effec- 
tive beam shape is the product of the array (beamformer) gain 
with the dipole beam shape. It should be noted that the dipole 
beam shape is strongly polarized, and along some directions in 
the sky the polarization could be as much as 20% of the total 
intensity. 

The nominal LOFAR FOV at around 150 MHz at the NCP 
(from null to null) is about 1 1 degrees in diameter for a CS and 
about 8 degrees in diameter for a RS. Therefore, the effective 
FOV is about 10 degrees in diameter There is also a compli- 
cated low level sidelobe pattern surrounding the FOV, and the 
sidelobes change with time and frequency, as the beamformer 
weights change. In order to minimize the cumulative effect of 
the sidelobes, each LOFAR station is given a different rotation 
in its dipole layout (but keeping the dipoles parallel). Due to this 
reason, each LOFAR station will have a unique beam shape, that 
varies in time, frequency as well as according to the direction 
being pointed at. For a widefield image, this naturally leads to 
beam variations that depend on time, frequency as well as the 
direction in the sky. 

2.2. The NCP and its surroundings 

It is of significant importance that the NCP FOV lies on a rela- 
tively cold (i.e., having low sky temperature) spot in the Galactic 
halo in order to minimize the effects of Galactic foregrounds. In 
Fig. [1] we show the NCP window (or FOV), which is overlaid 
on a full sky image observed at 50 MHz. The full sky image was 
made using an array of 16 LOFAR lowband dipoles, with the 
longest baseline of 450 m. 

The Galactic plane lies along an arc joining Cassiopeia A 
(CasA) and Cygnus A (CygA) in Fig.[T]but it is resolved. CasA is 
32 degrees away from the pole while CygA is about 40 degrees 
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to the beam gain along the direction of the NCP (which is unity) 
and constant in time. However, the element (dipole) gain varies 
and this is taken into account during calibration. 
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Fig. 1. NCP FOV overlaid on a full sky image at 50 MHz. The 
full sky image is made using the LOFAR lowband dipoles. 



away. The closest 3C source to the pole is 3C61.1 which is 4 
degrees away, with a total flux about 35 Jy (peak about 7 Jy) at 
150 MHz. It is fully resolved and an image made from a previous 
LOFAR observation is shown in Fig. |2] There are several other 
bright 3C sources in the vicinity including 3C390.3 (11 degrees 
away). 
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Fig. 3. Radio spectrum of NVSS JOl 1732+892848 compiled 
from data in the literature (iRengelink et al.l Il997l ICohen et al.l 
I2007h . 
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Fig. 2. 3C61.1 model image at 150 MHz. The two hot 
spots in this source, a giant double radio galaxy at z=0.188 
dLawrence et al.l[T996l) . have peak flux densities of about 7 Jy 
in an 8" PSF. The colourbar units are in Jy/PSF. 



Due to the station beams, many of the strong sources outside 
the main beam are suppressed significantly and the brightest (ap- 
parent) source in the FOV is NVSS JOl 1732+892848, about 30' 
away from the pole. This source is unresolved in our o bserva- 
tions and has a peak flux of 5.4 Jy at 352 MHz (Rengeli nk et alJ 
Il997h and a peak flux of 5.3 Jy at 74 MHz ( ICohen et al]l2007l) 
as reported in the WENSS and VLSS surveys, respectively. 
Therefore, we assume this source to have a flat spectrum within 
the observing band as in Fig.|3] which makes it a suitable candi- 
date for absolute flux calibration and noise estimation. Because 
it is very close to the pole, the nominal station beam gain is equal 



2.3. Motivation for observing the NCP 

The NCP is one of several observational windows for LOFAR 
EoR observations (Ide Bruvn et al.l l2"013h . The reasons behind 
choosing the NCP are numerous although this does not imply it 
to be the optimal choice. We list some of the positive and some 
of the potentially negative aspects of this choice: 



+ The geographical location of LOFAR makes the NCP win- 
dow observable at night time, throughout the year, at high 
elevation (53 degrees). 

+ Due to minimum projection effects of the uv tracks, we get 
almost circular uv coverage and therefore, we get an almost 
circular point spread function (PSF). 

± The strongest source in the FOV, 3C61.1 is attenuated by 
about 70% and the strongest (apparent) source is about 5 
Jy in peak flux. Not having a strong source in the FOV has 
both advantages and disadvantages. First, not having a strong 
source means a not so high signal to noise ratio (SNR) in 
calibration. However, there are less artefacts resulting from 
deconvolution residuals of strong sources. 

± Geostationary RFI that manages to escape flagging routines, 
which work on high-noise samples, may end up near the 
North Celestial Pole. Nonetheless, this provides a sensitive 
diagnostic of the presence of any faint, stationary, undetected 
RFI. 

- Finally, the NCP is located at a Galactic latitude of only 38 
degrees. The overall system noise is therefore higher than the 
coldest regions near the North Galactic Pole. 
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2.4. Observational parameters 

We used 40 core stations and 7 remote stations in our observa- 
tions. The shortest baseline is 60 m and the longest baseline is 
about 30 km. The observing frequency range is from 115 MHz 
to 163 MHz. There are about 240 subbands in each observa- 
tion, within this observing frequency range. Each subband has 
256 channels, covering a bandwidth of 195 kHz. The monochro- 
matic uv coverage for a typical 6 hour NCP observation is shown 
in Fig. m 
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Fig. 4. 6 hour monochromatic uv coverage at the NCP, using 40 
core stations and 7 remote stations. The longest baseline is about 
30 km. The two (red and blue) colours show the symmetric uv 
points obtained by conjugation of the data. 



256 channels at 2 s integration) is flagged u sing the aoflagger 
(lOfl^ringa et al.l 120101: lOffringa et al.ll2012h and averaged to 15 
channels (after removing the 8 channels at both subband edges, 
mainly to remove edge effects from the polyphase filter). This 
significantly reduces the size of the data that has to be processed 
in the following stages. However, with improvements in soft- 
ware, we intend to process data at a finer frequency resolution, 
once regular EoR observing has begun. 

3.2. Sky model 

Because there is no single bright source in the FOV, the sky 
model used for initial calibration of the NCP data contains about 
three hundred discrete sources, spread across the FOV of about 
10x10 square degrees. The most complex source in this region 
is 3C61.1 as shown in Fig.|2] In order to efficiently model this 
source, we use a model including shapelets ( lYatawattal20I Ih and 
point sources. The rest of the sky model is modeled as a set of 
discrete sources, having multiple point source components. The 
brightest source (NVSS JOl 1732+892848) is modeled as a sin- 
gle point source with a flat spectrum and a peak flux of 5.3 Jy. We 
like to emphasize here that we diverge from the traditional ' clean 
component' based sky model construction in order to minimize 
the number of c omponents used without the loss of accuracy 
(lYatawattal2OI0l) . In order to automate this process, we have de- 
veloped custom software (buildsky) that creates a sky model 
with the minimum number of source components required. The 
principle behind buildsky is to select the simplest model for a 
given source by choo sing the correct number of degrees of free- 
dom (lYatawattal201 ih . While a point source has only one degree 
of freedom (for its shape), a double source has two and so on. 
There are additional degrees of freedom due to its p osition and 
flux. W e use information theoretic criteria as given in lYatawattal 
( 1201 lb to select the optimum number of degrees of freedom for 
any given source. All the sources in the sky model are unpo- 
larized. The sky model was updated using two calibration and 
imaging cycles. 



With this uv coverage, we get a resolution of about 12" at 
150 MHz. For comparison, in pr evious NCP observations using 
the WSRT jBernardi et al.ll2010h . the longest baseline used was 
only 2.7 km yielding an angular resolution of only 120". The 
correlator integration time is set at 2 s. We used data taken on 3 
different nights for the results presented in this paper In TablelT] 
we summarize the observational parameters for these 3 different 
nights. 

3. Data Reduction 

In this section, we describe the major steps taken to calibrate the 
NCP observations. Apart from the initial processing, the data 
was completely processed in a cpu/gplQ cluster dedicated for 
LOFAR EoR computing needs. The processing of these obser- 
vations also enabled us to fine tune the software used in various 
processing steps. 

3.1. Initial processing 

The LOFAR correlator (iRomein et al.ll20l"ol) outputs data at a 
very fine resolution (2 s and 0.78 kHz), mainly to facilitate RFI 
mitigation. However, the data volume makes it cumbersome for 
further processing. Therefore, the data of each subband (having 

' CPU: Central Processing Unit, GPU: Graphics Processing Unit 



3.3. Calibration 

The aim of calibration of LOFAR EoR observations is twofold; 

(i) correction for instrumental and ionospheric errors in the data 

(ii) removal of strong foreground sources from the data such that 
specia lized foreground removal algorithms (e.g. iHarker et al.l 
(l2009l) ) can be applied. The basic descripti on of the LOFAR EoR 
data model used in calibration is given b v iLabropoulosI (|20 1 Ol) . 
We use an enhanced version of the LOFAR calibration pipeUne 
dPandev et al.ll2009h for the EoR data calibration. 

3.3.1. Data correction 

Major steps in our calibration pipeline are as follows: 

1. We first calibrate for clock errors as well as small time 
scale ionospheric errors along the center of the FOV. This 
is the so called uv plane or direction independent calibration 
{Labropoulos 2010) and is performed using the Black Board 
Selfcal (BBS) package dPandev et al.l [2009). At this stage, 
each subband has 15 channels at 2 s integration time. We de- 
termine the calibration solutions for every 10 s, and one solu- 
tion per subband. Since we do not have a dominant source at 
the center of the FOV, the solutions thus obtained correspond 
to small time scale ionospheric phase fluctuations common 
to the full FOV plus the clock errors. 
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Observation No. 


Start Time (UTC) 


No. Stations 


No. Subbands 


Noise (pJyPSF-') 




(duration 6 iiours) 


(delivering good data) 


(processed) 




L24560 


27-March-201 1 20:00:05 


45 


229 


125 


L25085 


lO-April-2011 20:00:05 


43 


185 


255 


L26773 


19-May-2011 19:30:00 


41 


187 


224 



Table 1. Summary of observational parameters (and noise level achieved with uniformly weighted images, at the edge of the FOV) 
for the 3 nights of data taken. The variability of the noise is due to variability in the sensitivity of the stations, some of which were 
less focused due to beamforming errors at the time of the observations. 



Once uv plane calibration is done, the data is corrected for 
these errors. The data is also corrected for the element beam 
gain along the center of the FOV. As discussed previously, 
the dipole beam of LOFAR is strongly polarized and we use 
an element beam model based on numerical simulations. For 
an area about 10 degrees in diameter in the sky, the variation 
of the dipole beam shape is assumed to be small and correc- 
tion for the center of the FOV is considered accurate enough 
for the full FOV. 

The corrected data is averaged to 183 kHz (one channel per 
subband) and 10 s integration time. The data is also flagged 
by clipping any spikes present in the data after correction. 




3.3.2. Source subtraction 

LOFAR has a very wide field of view and along each direc- 
tion, the errors present in the data are dif ferent due to vary - 
ing beam shape and ionospheric effects ( iKoopmansI l2010h . 
Therefore, source subtraction is not a simple deconvolution 
problem for LOFAR observations. Even for a simple deconvo- 
lution, i t is better to sub tract the sources directly from the visi- 
bilities (I Yatawattal20 1 Ol) . In the case of LOFAR, this subtraction 
has to be done with the appropriate gain corrections along each 
direction. 

In order to efficiently and accurately solve the multi-source 
calibration problem, we have developed algorithms and soft- 
ware based on Expectation Maximization (fv^tawatta et al.l2009l : 
iKazemi et all 1201 Ibt lYatawatta et alj |2012|) . We have imple- 
mented these algorithms (SAGECal) with accelerated process- 
ing using graphics processing units (GPUs). In the NCP win- 
dow, there are about 500 bright discrete sources (note that we 
subtract more sources than what we use for the uv plane cali- 
bration, for which we only use about 300) that are subtracted 
from calibrated v isibilities with the cor rect directional gains. We 
have 'clustered' (iKazemi et al. these sources into about 

150 different directions. Thus we effectively determine the eiTors 
along 150 directions during the source subtraction. An example 
of clustering is shown in Fig. |5] In the left panel of Fig. |5] we 
show two sources that are about 5' apart and apparently having 
identical error patterns. Therefore, instead of calibrating along 
each source individually, we can cluster them into one complex 
source and determine the common eiTors. The right panel in Fig. 
|5] shows the result after subtracting the cluster and restoring the 
model. 

There are still some errors remaining in the right panel of 
Fig. |5] mainly because of errors in the sky model and due to 
the effect of surrounding sources (that were not included in the 
sky model) and also due to short time scale ionospheric er- 
rors. It should also be mentioned that while most of the sources 
subtracted lie within the FOV, we have also subtracted strong 
sources far away from the NCP as shown in Fig. [1] for example 
CasA. 



jj.j^'- •.".j mai J/.w.^ laa •.w.f. 

Fig. 5. Clustering of two sources 5' apart. The left panel shows 
the two (point) sources with identical error patterns. The right 
panel shows the image made after determining a common error 
(at an interval of 20 minutes) for both sources and subtracting 
their contribution from the data. The sky model has been restored 
in the right panel. The colourbar (bottom) units are in Jy/PSF. 



3.4. Imaging 

We make images at different stages during calibration. All im- 
ages are made using CASA0. In order to update the sky model, 
we make images of the calibrated and source subtracted data. 
We keep the highest available resolution in order to create ac- 
curate source models. For the results presented in this paper, we 
have a resolution of about 12" and we choose a pixel size of 4" 
with uniform weighting. Even though the nominal FOV is about 
10 degrees, we make images that have an FOV of about 13 de- 
grees, to detect sources at the edge of the beam. We restore the 
subtracted sources onto these images, after convolving with the 
nominal (Gaussian) PSF. Afterwards, we use Duchamp dWhitind 
120 12h to select areas with positive flux and update the sky model 
using buildsky. 

More relevant for EoR signal detection are images made at 
low resolution using the short baselines of LOFAR. Therefore, 
we also make images using baselines less than 1200 wavelengths 
at 35" pixel size. The image size is chosen to be about 65 degrees 
so that we can see any contributions from the Galactic plane, 
which is about 30 degrees away from the NCP. 

4. Results 

In this section, we present results mainly to highlight important 
stages in the calibration and finally, to present the noise limits 
that we have reached using LOFAR. The results are based on 
all three datasets given in Table [1] unless stated otherwise, and 

- Common Astronomy Software Applications, |http://casa.rrrao.edu| 
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continuum images are made using inverse variance weighted av- 
eraging of the 240 subband images. 

4.1. The performance of SAGECal in directional calibration 

The effects due to beam shapes and the ionosphere are major 
causes of errors in LOFAR images. Therefore, directional cal- 
ibration is essential. We present a few images to highlight the 
performance of SAGECal. 

In Fig. |6l we present the area around the brightest source 
NVSS JOl 1732+892848. The image on the left is before multi- 
directional calibration and source subtraction, with only a deep 
CLEAN based deconvolution. The image on the right in Fig. |6] 
is after running SAGECal and after restoring the sky model onto 
the residual image. It is clear that the errors due to beam varia- 
tions and ionospheric variations have largely been eliminated in 
the right panel of Fig. |6] Some errors are still present, mainly 
caused by inaccurate source models used in multi-directional 
calibration. We emphasize that longer baselines are needed to 
construct accurate models for such complex sources. 

In Fig.|7] we give another example for the effect of the time 
interval chosen in SAGECal for multi-directional calibration. The 
image on the left is without any multi-directional calibration and 
significant errors due to beam shape and ionosphere are visible. 
The image in the middle of Fig.|7]shows the image obtained after 
running SAGECal with directional calibration performed every 
20 minutes. The beam variations, which are slower, are com- 
pletely eliminated by this procedure. However, the ionospheric 
variations, that could have time scales much less than 20 minutes 
are still present. 

The right panel of Fig. |7] shows the result after running 
SAGECal with a 'hybrid' solution scheme. In this case, we solve 
for bright sources once every 5 minutes and for fainter sources 
once every 20 minutes. Most of the ionospheric errors present in 
the middle panel have been removed in this figure. There are still 
errors due to inaccurate source models (the source was assumed 
to be a perfect point source, but this is not accurate enough) and 
also due to ionospheric phase variations with a time scale smaller 
than 5 minutes. 

The time interval of 20 minutes chosen for obtaining the so- 
lutions gives us 120 time samples (each sample is of 10 s du- 
ration). For each time sample, we have 990 baselines with 45 
stations. Therefore, we have about 8x120x990=1 million real 
constraints to obtain a solution. The number of real parameters 
in a solution is 45x150x8=54000 for 150 directions in the sky. 
Therefore, the ratio between the number of constraints and the 
number of parameters is about 18 which is more than sufficient 
to obtain a reliable solution. This can be further impro ved by us- 
ing da ta points at different frequencies as proposed bv lBregmanI 
(I2012h . 

4.2. Widefield images 

We present widefield images obtained for the full dataset given 
in Table [T] First, in Fig. [8] we present the image obtained af- 
ter calibration as described in section 13.3.11 but before run- 
ning SAGECal. Therefore, no source subtraction is performed 
and only traditional CLEAN based deconvolution has been ap- 
plied. The circle indicates an area of diameter 10 degrees. The 
peak flux of this image is about 5.3 Jy and the noise level is 
about 400/iJy/PSF. The complex source 3C61.1 is at the bot- 
tom left hand comer The striking features in this image are the 
artefacts surrounding almost every source. As described previ- 



ously, there are three major reasons for these artefacts; (i) vary- 
ing LOFAR beam shapes which are different for each station, 
(ii) ionospheric phase errors, and (iii) classical deconvolution 
errors due to having partially resolved sources. For instance for 
the case of 3C6 1 . 1 , all three of the aforementioned causes create 
artefacts, which are clearly visible close to the bottom left hand 
corner 

The only way to improve the image in Fig. [8] is multi- 
directional calibration as described in section 13.3.21 We have 
shown the image obtained after running SAGECal in Fig.|9l The 
circle indicates an area of diameter 10 degrees. Comparison of 
Figs.[8]and|9]shows that most significant artefacts in Fig.[8]have 
been eliminated in Fig. |9] The prominent artefacts that still re- 
main are due to the fact that CS and RS beam shapes have differ- 
ent FOVs and also due to frequency smearing. We now reach a 
noise level of about 100 yuJy/PSF at the outskirts of Fig.|9] while 
the peak value in the image is about 5.3 Jy. This corresponds to 
a formal dynamic range of 50,000: 1 . 

In Fig.[TOl we give the image made only with the short base- 
lines (< 1200 wavelengths) using the same data of Fig. |9] In 
this image, the circle shows an area with 10 degrees in diame- 
ter and the density of the sources close to this circle is clearly 
less than in other areas of the image. We also see a significant 
number of sources away from the FOV that are seen through 
sidelobes of the beam. Most of these sources have not been in- 
cluded in our multi-directional calibration and hence, they have 
significant artefacts. The noise level in this image is about 300 
/iJy/PSF with the peak flux of about 5.3 Jy. Note, however, that 
the noise level is a strong function of the distance from the field 
centre. 

4.3. New sources 

Since we reach a noise limit of about 100 //Jy/PSF, we detect a 
large number of sources that have not been detected in previous 
observations, even at higher frequencies. In Fig. [TT] we present 
a small area (0.6x1.0 degrees) of Fig.|9]to compare to an image 
fromWENSS. 

We present an area close to the NCP in Fig. [12] The left panel 
in Fig. [12] shows an image made with all baselines which gives 
a PSF of 12". The right panel in Fig. [T2] shows an image made 
using only the core baselines and has a PSF of about 150". An 
important lesson that can be drawn from Fig. [T2]is the total ab- 
sence of artefacts close to the pole. If there would be any residual 
geostationary RFI, their effects would accumulate near the pole. 
However, we see no unexplained artefacts. 

4.4. Effects of bright sources at large angular distances 

As shown in Fig.[T] there are a few bright sources in the neigh- 
borhood of the NCP. We have already mentioned 3C61.1 which 
is still well inside the FOV. The other source that has a signif- 
icant effect is CasA, which is about 30 degrees away from the 
NCP. In Fig. [13] we present the images around CasA, made with 
only the core station baselines. The images with baselines us- 
ing core stations only are more affected by CasA than images 
that include remote stations. There are at least four reasons that 
contribute to this. First, core stations have wider station beams 
(compared to a remote stations) and therefore, CasA is less at- 
tenuated on core-core baselines . Secondly, the core stations have 
more short baselines than remote stations, hence see more flux 
from CasA, which is heavily resolved at baselines longer than 
lOOO/l. Thirdly, time and frequency smearing lead to a signif- 
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Fig. 6. An area (about 0.5 deg x 0.5 deg) close to the NCP before and after multi-directional calibration using SAGECal. The 
image on the left is before running SAGECal and after a deep CLEAN deconvolution and errors due to beam variation and (some) 
ionospheric variations are clearly visible. On the right hand image, the sources are subtracted directly from the visibility data and 
the sky model is restored onto the residual image. Most of these errors visible in the left hand image are eliminated in the right 
hand image as CLEAN based deconvolution fails to consider the directional errors into account. The peak flux is 5 Jy/PSF and the 
colourbar units are in Jy/PSF. 
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Fig. 7. The performance of SAGECal with different solution intervals. The image on the left is without multi-directional calibration. 
The image in the middle is after running SAGECal with a solution interval of 20 minutes. The image on the right is after running 
SAGECal with a hybrid solution interval, where solutions are obtained along bright source clusters at every 5 minutes and for fainter 
source clusters, every 20 minutes. It is clear that the small scale ionospheric errors present in the middle figure are mostly eliminated 
in the right panel. However, ionospheric variations due to decorrelation eflfects within the 5 minute interval are still present on the 
right panel. The colourbar (bottom) units are in Jy/PSF. 
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Fig. 8. The NCP image after calibration, but before running SAGECal, which has also been deconvolved using CASA. The circle 
indicates an area of diameter 10 degrees. The source 3C61.1 is at the bottom left hand corner The image has 7200 x 7200 pixels of 
size 4" with a PSF of 12" and the noise level at this stage is still about 400 /^Jy/PSF. The colourbar units are in Jy/PSF. 



icant attenuation of the visibilities of distant sources. Fourthly, 
ionospheric effects such as non-isoplanaticity rapidly increase 
with the length of the interferometer baseline. 

In Fig. [13] left panel, we show an image where we have run 
SAGECal while ignoring the effect of CasA. In other words, we 
did not include CasA in our sky model and neither did we solve 
along the direction of CasA. CasA is at the bottom of this im- 
age and is heavily distorted due to beam and ionospheric errors 
as well as time and frequency smearing. The ringlike structures 
centered at CasA and spreading throughout the image is clearly 
visible. However, there is an area shaped like a cone that is di- 
rected towards the NCP where there are no ripples. This is due 
to the fact that multi-directional calibration that ignores CasA 
(mainly close to the NCP) has absorbed the effect of CasA di- 
rected towards the NCP. Similar effects can be seen in sequential 
source subtraction schemes such as 'peeling'. 



In the right panel of Fig. [13] we have included CasA in our 
multi-directional calibration using SAGECal. Most of the ripples 
in the left panel are eliminated in the right panel of Fig. [TJI 
Furthermore, the noise level is reduced by a few percent after in- 
cluding CasA in the calibration. There are still some errors close 
to the location of CasA. This is mainly due to errors in the CasA 
source model used in the subtraction a nd we expect to get be tter 
results with an updated source model (lYatawatta et al.ll2"012l) . 



Because CasA is a strong source, we clearly see its effect 
as we have just described. Conversely, even faint sources would 
have a similar effect, albeit at a low magnitude. We perform a 
statistical analysis of this effect due to faint sources in section|5] 
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Fig. 9. The NCP image after multi-directional calibration and source subtraction using SAGECal. After a shallow deconvolution 
using CAS A (mainly to estimate the PSF), the skymodel is restored onto the image. The circle indicates an area of diameter 10 
degrees. The image has 12000 x 12000 pixels of size 4" with a PSF of 12" and the noise level is about 100 |/Jy/PSF. Due to the fact 
that RS and CS beam shapes have different FOVs the sources at the edge of the image are distorted. In addition, due to frequency 
smearing, the sources at the edge of the image appear 'attracted' towards the center The colourbar units are in Jy/PSF. 
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Fig. 10. The NCP image after multi-directional calibration and source subtraction using SAGECal, using only the short baselines. 
After a shallow deconvolution using CASA (mainly to estimate the PSF), the sky model is restored onto the image. The circle 
indicates an area of diameter 10 degrees. The image has 2000 x 2000 pixels of size 35" with a PSF of 150" and the noise is about 
300 juJy/PSF. The colourbar units are in Jy/PSF. 



10 



S. Yatawatta et al.: Initial deep LOFAR observations of EoR windows 





Fig. 11. Comparison of a small area of the NCP image, size 0.6 x 1.0 degrees with WENSS. The left panel shows the image from 
WENSS (PSF 60") while the right panel shows the equivalent image made using LOFAR (PSF 12") after running SAGECal and a 
shallow deconvolution. The colourbar units are in Jy/PSF. Many more sources, at much higher angular resolution can be detected. 
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Fig. 12. Images of the area close to the NCP. The image on the left panel is using all baselines and has a pixel size of 4" and a PSF 
of 12". The image on the right is using core only baselines with a pixel size of 35" and a PSF of 150". The colourbar units are in 
Jy/PSF 



4.5. Noise 

One important question that needs to be answered is whether we 
have reached the theoretical noise limits in our images and if not, 
provide plausible explanations for t he difference. The th eoretical 
noise can be calculated as follows dNiiboer et al.ll2009h : 

Win,aging 

nOlSCsubband = , Z. (1) 

A A A ( N,m-l) , A'.A',- , N,.(N,.-l) \ 

In ([1]), the noise per subband (bandwidth 183 kHz) is given by 
noisCsubband whilc the system equivalent flux densities for core 
and remote stations are given by 5 c and S respectively. We as- 
sume Sc - 3360 Jy and S r - 1680 Jy, taking the proximity to 
the Galactic plane into account. The total bandwidth and integra- 



tion time are considered as A/=183 kHz, and A,~6x3600 s. The 
total number of core and remote stations are given by A^^. and A^r- 
The scale factor Wimaging is used to take imaging weights into ac- 
count and we take this equal to 2 for uniform weighting. For the 
observation with highest sensitivity from Table[T](L24560), with 
Nc = 38, Nr = 7, we get noisCsubband = 1-46 mJy. Moreover, for 
the full observation, with 229 subbands, and taking the variation 
of S c and S with frequency into account, we get a theoretical 
noise value of about 93 fi3y. As seen in Table [T] we are only a 
factor of 1 .4 from reaching the theoretical noise limit. An impor- 
tant question is how to estimate the noise in the images, in other 
words, which area of the image to use to calculate the noise. In 
Fig. [14] we show the variation of the noise across the 13x13 de- 
grees image shown in Fig. |9] As expected, the noise close to the 
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Fig. 13. Images of CasA, which is about 30 degrees away from the NCP, made only using the short baselines (< 1200 wavelengths). 
The distorted image of CasA (due to smearing and directional errors) is at the bottom of these figures. The image on the left is 
obtained after running SAGECal without taking CasA into account. The image on the right is after adding CasA to the sky model 
and running SAGECal. It is clear that the ripples on the left panel are eliminated on the right panel. Moreover, the left panel shows a 
'cone' directed towards the NCP where the ripples are absent. The colourbar units are in Jy/PSF. 



NCP is higher and is about 180 /iJy due to additional contamina- 
tion by unsubtracted compact sources as well as by diffuse fore- 
grounds. However, with the subtraction of many more sources 
as well as foregrounds, we expect to reach the noise that we see 
at the edge of the image in Fig. [141 Therefore, we study the be- 
havior of the noise seen at the edge of this image as this is what 
we intend to reach. We emphasize that the achieved continuum 
noise levels in low resolution images, i.e. made using only the 
core stations, will be limi ted by classical confu sion noise in the 
inner p arts of the images. iBernardi et al.l (l2009b and iPizzo et alj 
(l201ll) estimated this to be about 0.6 mJy for the WSRT at a 
frequency of 140 MHz. Because the LOFAR core has a similar 
extent as the WSRT we expect this to be the asymptotic limit of 
the continuum noise level in core-only LOFAR observations. 

To study the noise behavior in more detail, we provide sev- 
eral figures where we give the noise (or the standard deviation) 
of a small rectangular area in the images (about 4 degrees away 
from the NCP) at different frequencies and using different ob- 
servations listed in Table [1] In Fig. [15] we present noise lev- 
els per subband (standard deviation), determined with images 
made using the three nights with imaging parameters as in Fig. 
|9l Additionally, we also show the noise estimated at the NCP in 
Fig- HI The main conclusion that can be drawn from Fig.|9]is the 
variability of the noise (or the sensitivity) of LOFAR from night 
to night. This is due to some stations not working or not working 
at full sensitivity at different nights. We expect this to be much 
more stable before the commencement of dedicated EoR obser- 
vations. 

The best night in Fig. [15] is for the observation number 
L24560 of Table [T] which has a noise level of about 2 mJy per 
subband at the high frequency end. There is a steep rise in the 
noise level at frequencies below 130 MHz. This we attribute to 
the rising contribution of Galactic background noise, increased 
sidelobe noise from an increasing number of faint background 




-6 -4 -2 2 4 

Angle from NCP/deg 



Fig. 14. Variation of noise across the image shown in Fig. [9] The 
noise is estimated using a moving rectangular window of about 
400 X 400 pixels and the noise is only calculated when there are 
no sources (> 0.5 mJy) within the rectangular window. The noise 
variation is shown across a 13x13 degrees FOV. 



sources (due to both a wider primary beam and steeper source 
spectra), emission from the Galactic plane and other very bright 
sources like CygA (see section O. There are also some spikes 
and dips in the noise curves where RFI removal has flagged sig- 
nificant amounts (more than 30%) of data. We have discarded 
such images from further analysis. 

In Fig. [16] we give the image difference noise estimates for 
the best night (L24560) of Table [1] The image difference noise 
is calculated by (i) subtracting two images adjacent in frequency 
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Fig. 15. Noise in images made using three nights of data. The 
noise of each image is estimated at a location at the edge of the 
FOV as well as at the NCP, using a rectangular window of about 
400 X 400 pixels. The three different colours correspond to the 
three different nights listed in Table [1] 



(ii) estimating the standard deviation of the subtracted image (iii) 
multiplying the standard deviation by 1 / yjl. We have also plot- 
ted the absolute noise level in the same figure. It is clear from 
Fig. [16] that the absolute and differential noise curves are al- 
most identical. This suggests that our absolute noise estimate 
does not contain any frequency independent systematic effects, 
except systematic effects that are uncorrected between adjacent 
images. Future processing would take the difference at narrower 
frequencies to verify this. 



have averaged B images (or subbands) together and calculated 
the B difference between images. We have done this for B - 
2, 4, 8 in Fig. [16] As B increases, the differential noise should be 
more affected by systematic effects due to the large bandwidth 
of averaging. This, in turn should be reflected by the differential 
noise not decreasing as 1 / Vb. However, we do see a decrease 
by 1/ Vb in the plots in Fig. [16] even when B = 8. 

We also show the noise comparison between the images 
made with all baselines and with the images made with core only 
(< 1200 wavelengths) baselines in Fig. [17] The corresponding 
continuum image is given in Fig. [TO]where the imaging param- 
eters are also given. We have estimated the noise of image with 
core only baselines at a location about 5 degrees away from the 
center of Fig.[TOl 




135 140 145 
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Fig. 17. Comparison of noise in images made with all baselines 
(blue) and core only (< 1200 wavelengths) baselines (red). 
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In Fig. [18] we give the image difference noise plots for im- 
ages made only with core baselines. We have also plotted the 
absolute noise curve which agrees well with the image differ- 
ence noise. Similar to Fig. [16] we have also averaged adjacent 
subbands of groups 2,4 and 8 and found the differential noise 
between averaged images. 

In Fig. [19] we show the noise plots for all 4 Stokes images 
(I,Q,U,V) for one night (L24560) of observed data. The noise 
level of Stokes 1 is significantly higher which we attribute to un- 
subtracted sources. We give a detailed analysis of this in section 

m 



115 120 125 130 135 140 145 150 155 160 
Freq/MHz 

Fig. 16. Image difference noise of images made using one night 
(L24560) of data. The noise of each image is estimated using a 
rectangular window of about 400 x 400 pixels, after taking the 
difference of images adjacent in frequency. We have also aver- 
aged images adjacent in frequency into groups of 2,4 and 8 and 
have taken their difference as well. 



We have also plotted in Fig.[T6]image difference noise curves 
where instead of taking the difference of adjacent subbands, we 



4.6. Linear polarization 

Due to the fact that we correct for the element (dipole) beam 
polarization along the direction of the NCP during calibration, 
we do not see substantial instrumental polarization in our Stokes 
Q,U and V images. Note also that we expect most of the sources 
to be intrinsically unpolarized in this frequency range. Even 
though we only correct for the element beam gain along the di- 
rection of the NCP, the relative variation within the full FOV of 
the dipole beam shape is very little (therefore this correction is 
satisfactory within the full FOV). However, we expect to see an 
increase in instrumental polarization at the edge of the FOV but 
the station beam attenuation makes this instrumental polariza- 
tion less obvious. During the multi-directional calibration phase 
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using SAGECal, we also solve for a full Jones matrix and there- 
fore, what remains of this instrumental polarization is correctly 
subtracted. 

In order to detect any diffuse Galactic foregrounds that might 
appear in polariz ed images, we have performed rotation measure 
(RM) synthesis dBrentiens & de Bruvnll2005h using data from 
one of the nights listed in Table [T](L24560). We show the to- 
tal polarized intensity image for RM=0 rad m"^ in Fig. |20] The 
noise in this image is about 110 ;uJy/PSF and apart from many 
weakly instrumentally polarized discrete sources, we detect only 
very faint diffuse structure, at any value of RM. However, we 
have seen that subtraction of compact sources using SAGECal 
would suppress the unmodeled diffuse foregrounds. To allevi- 
ate this from happening in future processing, we would ignore 
the contribution from short baselines while running SAGECal for 
calibration along multiple directions. 



Fig. 18. Image difference noise of images made with core only 
(< 1200 wavelengths) baselines for one night (L24560) of data. 
We have also plotted the absolute noise which agrees well with 
the differential noise. We have also averaged images adjacent in 
frequency into groups of 2,4 and 8 and have taken their differ- 
ence as well. 




160 



Fig. 19. Noise in all four Stokes images for one night (L24560) 
of observed data. The images are made with core only (< 1200 
wavelengths) baselines. At the high frequency end, the noise in 
total intensity is about 1.6 times higher than the noise in polar- 
ization. 



5. Effect of Outlier Sources in Image Difference 
Noise 

In this section, we present an analysis of the contribution of 
sources far away from the field center to the (differential) noise 
of images made at the NCR In fact, this analysis can be extended 
to any interferometric observation. We show that our ignorance 
of these sources indeed act as an additional source of noise. 

Consider an elementary interferometer The visibility 
V{up, Vp, Wp) at coordinates Up, Vp, Wp on the uv plane is 

V(up, Vp, Wp) = 

f fs{l TO)e-i2'^7(",.'+'V'«+''V(Vi^P^-i))_^f^^_ (2) 
J J ' Vl - P - m2 

where S (/, m) is the sky flux density and /, m are the direction 
cosines. The frequency of the observation is / while the speed 
of light is c. We assume the sky to consist of a set of discrete 
sources, and we arrive at 



V(Up, Vp, Wp) = 2j '"9)^ ^ y 1 1 , 



(3) 



where 1(1, m) is the sky intensity. 

Considering a bandwidth of A for smearing, we have the 
smeared value of V{Up, Vp, Wp) around frequency fit as 

I /-/o+A/2 

V(up, Vp, Wp) = - I V{up, Vp, Wp)df (4) 

A J/o-A/2 

and assuming I{lq, nig) variation is small over this bandwidth, 
this reduces (O to 

(5) 
(6) 



V(up, Vp, Wp) = 



1 

X sine 



[llplq + VpTHq + W p{ ^l-l^- - l)jj 



Let M denote the number of samples in the uv plane and 
denote the number of sources in the sky. We represent the 
visibilities at all points on the uv plane in vector form as b (size 
M X 1) and the intensities of all discrete sources in the sky as b 
(size N X I) where 



(7) 





V{ui,vi,wi) 




■ /(/i,mi) - 


b = 


V(U2,V2,W2) 


, b = 


lih^m) 




- V(um, Vm, Wm) . 




_ I{In, mn) . 
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Fig. 20. Total polarized intensity image at RM=0 rad m ^ using core only baselines (150" PSF), after running SAGECal. The circle 
indicates an area of 10 degrees in diameter The noise is about 110 /iJy/PSF. The colourbar units are in Jy/PSF. 



We can relate b and b as 

b = T/„b (8) 
where the elements in the matrix T /„ (size M x A^) are given as 
[T/o]p, = (9) 

^-j27r^^u„l,i+v„m^+Wi,(-\/l-ll-ml-l)j 

: {uplq + VpTUq + Wp{ ^1 - /2 - ,7^2 _ _ 



e 

X smc I 



Let D be the number of pixels of the image in which the noise 
is calculated. Consider the construction of the image pixels given 
by b (size D x 1) 

i{h,m) 
iCk, m) 



b = 

I(Id, mo) 

from the observed data b. We can write this as 

b = f|b 



(10) 



(11) 



where the elements in the matrix T y„ (size M x D) are given as 

tt! 1 -;2;r ( Up /, + VpHi, + Wp ( y 1 -/^ -m^ - 1 ) ) 



and in the reconstruction, no smearing is assumed. Note also that 
the sets £. and £. that denote the positions of the outlier sources 
and the positions of the pixels 

£.^{{h,mi),...,(lN,mN)], £^ {Clumi),...,(lD,mD)} (13) 

have no relation to each other In general the pixels coordinates 
X where we calculate the differential noise, are on a regular grid. 
Using ([8]) and ( fTTT i. we get 



(14) 



and the difference image at frequencies /i and /o can be given as 



e = (TtT,-T;TA)b 



(15) 



assuming the variation of b is negligible within this bandwidth. 
The noise variance in the difference image is proportional to ||e|p 
and the average noise variance per pixel is Helh/D and the stan- 
dard deviation is ^J\e\\^/D. 

Considering a random distribution of outlier sources, we can 
also find the expected value of ||e|p as 

E{\\ef} = £{e^e) = ^Ib^^b} (16) 

where T y,y„ (size N x N)is given by 

T/,/„ = Re ((T) Ty, - T; T/J«(T; Ty, - T} T^j) (17) 
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and note that T/,/(, is symmetric positive semi-definite because 
of the factorization as above. We also consider T/j„ to be real 
because the sky image is real. 

We find an upper and a lower bound for /slUelP) as 

■^E{\ Tf,f„ \]E{\h\~} < £{||e||2) < £{max(diag(Ty,/„)))£{|b|2) 

(18) 

where | Ty,/„ | is the sum of all elements in Ty, /■„ and |b| is the 
sum of all elements in b. The diagonal entry with the highest 
magnitude in T/,y„ is given by max(diag(T/,/„)). The proof and 
the underlying assumptions can be found in appendixlAl 

We draw a few conclusions from ([18}: (i) By properly se- 
lecting the uv coverage and the frequencies /i and /o and also 
imaging weights (in the derivation natural weights are assumed) 
and image pixel sizes, we can change T/jg - (ii) However, |b|~ is 
entirely determined by the intensities of the outlier sources and 
the only way to minimize |bp is by suppressing the beam side- 
lobe level or by subtracting the outlier sources, as we have done 
for CasA. (iii) We get the lowest value for the noise when all 
outlier sources have intensities that are equally distributed while 
we get the highest noise when there is one very bright source 
(see appendixlAl). 

In order to relate (fTFt to NCP observations, we need to es- 
timate |bp due to the outlier sources in the images. For the par- 
ticular case of the NCP, we select the short baselines (< 1200 
wavelengths) from the uv coverage in Fig.|4] The corresponding 
image of the FOV is shown in Fig.fTO] With the same pixel size 
of 35", we have made an image of 6400 x 6400 pixels. From this 
image, we have selected all pixels that have a flux > 1 .2mJy that 
are outside the FOV and we show the selected pixels in Fig. 1211 
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Fig. 21. Pixel locations outside the 10 degree FOV of the NCP 
image where outlier sources (flux > 1 .2mJy) are detected. The 
image noise is about 0.3mJy and we have selected 24000 pixels 
out of 6400 X 6400 pixels. 



We consider the vector b to be formed by the pixels selected 
in Fig. |2T| After forming this vector, we have calculated |b| for 
different frequencies (using images made at those frequencies) 
as shown in Fig. |22] We have also fitted a model for |b| (the red 
Une) which is 
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(19) 
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Fig. 22. The variation of |b| with frequency (circles). The model 
fitted to |b| is shown by the solid red line. The length of b is 
24000 and the pixels are chosen as shown in Fig. 1211 



Using the model for |b| given by ( fT9b , we can approximate 
Zidbp). Next, using this approximation, we simulate ( fTST i, for 
various image differencing frequencies. We simulate a sky con- 
sisting of N = 500 outlier sources. Note that some of these 
sources will have intensities below the receiver noise level at any 
given frequency. But we make an important distinction that what 
matters is the cumulative effect of all the sources taken together, 
and not their individual contributions. The pixel grid for the sim- 
ulation at the center has dimensions 100 x 100, so D = 10000. 
The uv coverage is chosen to be the core (< 1200 wavelengths) 
uv coverage from Fig. |4] For a longest baseline of 1200 wave- 
lengths, the resolution is about 3' and the pixel size is chosen to 
be larg e enough ( lOQ so that we are not dominated by sampling 
errors (lYatawattall2010l) and that the noise is not correlated be- 
tween pixels. The geometry of one realization of the simulation 
is shown in Fig. [23] Note that the positions of the outlier sources 
are randomly varied in each realization. 
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Fig. 23. Outlier source positions and the pixel grid positions on 
which the differential noise is calculated. 
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We have done simulations for three different values of smear- 
ing bandwidth, A =200 kHz, A =20 kHz, and A =2 kHz. The 
bandwidth where the difference is taken is also equal to A. For 
each value of A, we generated 100 different sky realizations as 
inFig.|23l 
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Fig. 24. The noise standard deviation due to outlier source for 
different values of smearing (and differencing) bandwidth. The 
upper bound is given in (a) and the lower bound is given in (b). 
As the smearing bandwidth decreases, the bounds on the noise 
are also reduced. 



We have shown the upper and lower bounds for the standard 
deviation of the noise due to outlier sources per pixel in Fig.l24l 
We assume the noise contribution due to outlier sources at the 
highest frequency (160 MHz) is zero. As expected, as the value 
of A decreases, the noise due to far away sources is also reduced. 
So for low values of A, the result will be dominated by the re- 
ceiver noise. In future processing, we intend to use A =20 kHz 
and lower. 



all three nights of data listed in Table [T] Using all three nights of 
data for polarization analysis requires correction for the effects 
due to differential Faraday rotation which varies from night to 
night and will be done in future processing. These results act as 
a precursor of results with more dedicated EoR observations that 
are about to begin with LOFAR. 

We briefly discuss the differences between previous WSRT 
observations of the NCP (72 hours, 14 MHz) and the LOFAR ob- 
servations (10 hours effective, 48 MHz) presented in this paper 
Due to having only a 2.7 km longest baseline, the WSRT obser- 
vations were limited by classical source confusion in the con- 
tinuum. Moreover, the elevated location of the WSRT receivers, 
and the interference from the building, made them much more 
prone to RFl. The WSRT observations also used only 14 stations 
while LOFAR had about 40 stations Furthermore, the data pro- 
cessing capabilities are significantly better than what was avail- 
able to process WSRT observations. Therefore, despite the much 
longer integration time used in the WSRT observations we have 
obtained images that are much better in terms of resolution as 
well as noise. In fact, we have presented the deepest interfero- 
metric images ever made at this frequency range, to this date. 
The observed noise levels - in high resolution images, linear po- 
larization and at the edges of low resolution wide field images, 
are within a factor of 1 .4 from the theoretical noise that can be 
achieved based on the nominal system equivalent flux density 
and the effective integration time. 

Nonetheless, we can see limitations in the current results that 
can be attributed to the choices made in our processing: 



We have only a partially correct sky model that contains 
about 500 discrete sources. However, we see much more than 
this in our images but at present we are limited by the res- 
olution that is only about 12". We therefore intend to use 
more data with longer baselines (even international LOFAR 
stations) to update our sky model. 

We have thus far not seen (and ignored) the effect of (spatial) 
differential Faraday rotation (DFR) due to the ionosphere in 
our data. Note that even though we calibrate for full Jones 
matrices using SAGECal, we do not correct the data using the 
solutions. However, we shall s oon incorporate taking DFR 
into account in our calibration (lYatawattall20T2l) . We believe 
this will be crucial when we get data with significantly longer 
(80 km) baselines. 

In the processing of our data we were limited by compu- 
tational resources. We have therefore invested significant ef- 
fort in optimizing the performance of our calibration pipeline 
using GPUs to accelerate our processing. This will en- 
able us to employ sophisticated algorithms yielding better 
results. F or instance, with a sl ightly enhanced version of 
SAGECal dYatawatta et alj|2012l) . we intend to process data 
at a finer spectral resolution, therefore eliminating the band- 
width smearing visible at the edge of the FOV in our images. 
This will also decrease the effect of outlier sources, as dis- 
cussed in section|5] 



6. Discussion/Conclusions 

We have presented the first deep imaging results made with 
LOFAR that reach a noise level of about 100 yuJy/PSF both in 
total intensity and in polarization. Note that for calculation of 
the noise in polarization, we have only used one night of data 
while for calculation of the noise in total intensity, we have used 



In summary, we do not see any major obstacles to prevent us 
from going considerably deeper through much longer multiple- 
nights integration. In forthcoming publications we will investi- 
gate the properties of the sources detected in the images, make 
a detailed study of effects due to the Galactic foreground and 
determine limits on signals from the Epoch of Reionization. 
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when 

k k 

<^3'|<> 1 <^<M and 
/=i /=i 

N N 

i=\ i=\ 

are satisfied ( [Marshall et al.ll201 ll) . 
Consider the quadratic form 

0(x) = x''^Ax 



(A.3) 



(A.4) 



where A is an x real symmetric matrix. If the elements in A 
satisfy 

J]([A],,,-[A],+ij)>0, i^\,...,N,k^\,...,N-\ (A.5) 



then we say 0(x) is Schur-convex (iMarshall et al.ll20TTl) . 

If 0(x) is Schur-convex and with two vectors of positive real 
numbers, x and y, with x < y, then 



0(x) < 0(y). 

A.2. Assumptions 

Using conditional mean, we rewrite (fT&t as 



(A.6) 



(A.7) 



where fiibil-) is expectation with respect to |b|, i.e. the sum of all 
elements in b. 

We make to following assumptions in order to proceed: 

- The source fluxes (elements in b) and their positions are both 
random variables. We assume that the probability distribu- 
tion for the fluxes is statistically independent of the proba- 
bility distribution for the source positions. In other words, 
the statistical properties of b in ( IA.7b are independent of the 
statistical properties of Tj\f„- 

- We assume Tj\if, to satisfy the properties given in (IA.5) . 
More precisely, we assume that there exists a permutation 
of the rows and the columns of T/j/^ to satisfy this condition. 
Since we did not impose any ordering on the pixel positions 
in (flJl l this permutation is feasible. We do not need to find 
this permutation since the end result does not depend on it. 
But we assume that T/,yi, is permuted such that the largest 
diagonal entry is at the 1-st row and the 1-st column. 



Appendix A: Proof of dTS) 

A.1. Mathematical background 

Consider a vector of real numbers x of size A^x 1 with Xj denoting 
the i-th element. 



X = [xi,X2,...,Xn] 



(A.l) 



We can order the elements in x in descending order and construct 
a new vector 

(A.2) 



[x] - [x[i], a:[2], . . . , JC[A']] 



where > X[2] > . . . > X[n]- Given two vectors of real num- 
bers, X and y of size x 1, we say x is majorized by y, or x < y 



A.3. Derivation of bounds 

Let z = ^ and we see that z is a vector of positive elements 
that add up to 1, i.e. |z| = 1. Consider the following vectors 



z = [l/N,l/N,...,l/Ny , and 
z = [1,0,0,. ..,0]^. 

We see that from the properties of (IA.3K 

z < z < z. 



(A.8) 



(A.9) 



Together with (IA.6b and (|A.9l l, and the assumptions on Tf^^, we 
get 

z^T/,/„z < z^T/,/„z < z^T/j/„z. (A. 10) 
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We can simplify dA.lOb as 



fluxes while the upper bound is obtained when only one source 
has the total flux (or when there is one very bright source). 



^|T/,/ol < z^Tf,f,z < max(diag(Ty,/„)) (A.ll) 



where |T/,/„| is the sum of all elements in T fj„ and the diagonal 
entry with the highest magnitude of T/,/„ (at row 1 and column 
1) is given by max(diag(T/,yi,)). 



Substituting (lA.llb back to ( IA.7b and considering the inde- 
pendence of b and T/j;,, we get ( fTSl l. 



One additional point to note is that the lower bound is ob- 
tained when z - z, i.e. when the elements in b are uniformly 
distributed with equal values. Moreover, the upper bound is ob- 
tained when z - z, i.e. the total value of the elements in b is 
concentrated at one element. Reverting back to the fluxes of the 
sources, the lower bound is obtained when all sources have equal 
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